Online Learning with Sample Path Constraints

Authors

  • Shie Mannor
  • John N. Tsitsiklis
  • Jia Yuan Yu
Abstract

We study online learning where a decision maker interacts with Nature with the objective of maximizing her long-term average reward subject to some sample path average constraints. We define the reward-in-hindsight as the highest reward the decision maker could have achieved, while satisfying the constraints, had she known Nature’s choices in advance. We show that in general the reward-in-hindsight is not attainable. The convex hull of the reward-in-hindsight function is, however, attainable. For the important case of a single constraint, the convex hull turns out to be the highest attainable function. Using a calibrated forecasting rule, we provide an explicit strategy that attains this convex hull. We also measure the performance of heuristic methods based on non-calibrated forecasters in experiments involving a CPU power management problem.
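The reward-in-hindsight defined in the abstract can be illustrated with a small sketch. The abstract does not spell out the formal setup, so everything below is an assumption made for illustration: finite action sets for the decision maker and Nature, a reward table `r[a][b]`, a single cost table `c[a][b]` with budget `c0`, and a decision maker who plays one fixed mixed action against the empirical distribution `q` of Nature's choices. All names (`reward_in_hindsight`, `r`, `c`, `q`, `c0`) are hypothetical and do not come from the paper. With a single linear constraint, an optimal mixed action can be found among pure actions and two-action mixtures that meet the budget exactly, so the sketch simply enumerates those.

```python
from itertools import combinations


def reward_in_hindsight(r, c, q, c0):
    """Best expected reward of a fixed mixed action against q, subject to
    expected cost <= c0. Returns None if no mixed action is feasible.

    r[a][b], c[a][b]: hypothetical reward and cost tables (assumed setup).
    q[b]: empirical frequency of Nature's action b.
    """
    n_actions, n_nature = len(r), len(q)
    # Expected reward and cost of each pure action against q.
    rbar = [sum(q[b] * r[a][b] for b in range(n_nature)) for a in range(n_actions)]
    cbar = [sum(q[b] * c[a][b] for b in range(n_nature)) for a in range(n_actions)]

    best = None
    # Candidate 1: pure actions that already satisfy the budget.
    for a in range(n_actions):
        if cbar[a] <= c0:
            best = rbar[a] if best is None else max(best, rbar[a])

    # Candidate 2: mixtures of a cheap and an expensive action that
    # spend the budget exactly.
    for i, j in combinations(range(n_actions), 2):
        lo, hi = (i, j) if cbar[i] < cbar[j] else (j, i)
        if cbar[lo] < c0 < cbar[hi]:
            w = (c0 - cbar[lo]) / (cbar[hi] - cbar[lo])  # weight on `hi`
            value = (1 - w) * rbar[lo] + w * rbar[hi]
            best = value if best is None else max(best, value)
    return best


# Toy numbers (invented): action 0 is "high power", action 1 is "low power".
r = [[1.0, 0.5], [0.2, 0.2]]
c = [[1.0, 1.0], [0.0, 0.0]]
q = [0.5, 0.5]
value = reward_in_hindsight(r, c, q, c0=0.5)
```

In this toy instance the high-reward action alone exceeds the budget, so the best feasible choice mixes the two actions. The sketch only evaluates the hindsight benchmark for a given `q`; it says nothing about whether an online strategy can attain it, which is the question the paper addresses.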


Similar resources

Online Learning with Constraints

We study online learning where the objective of the decision maker is to maximize her average long-term reward given that some average constraints are satisfied along the sample path. We define the reward-in-hindsight as the highest reward the decision maker could have achieved, while satisfying the constraints, had she known Nature’s choices in advance. We show that in general the reward-in-hi...


Online Multi-task Learning with Hard Constraints

We discuss multi-task online learning when a decision maker has to deal simultaneously with M tasks. The tasks are related, which is modeled by imposing that the M-tuple of actions taken by the decision maker needs to satisfy certain constraints. We give natural examples of such restrictions and then discuss a general class of tractable constraints, for which we introduce computationally effici...


An Optimized Online Secondary Path Modeling Method for Single-Channel Feedback ANC Systems

This paper proposes a new method for online secondary path modeling in feedback active noise control (ANC) systems. In practical cases, the secondary path is usually time-varying. For these cases, online modeling of the secondary path is required to ensure convergence of the system. In the literature, the secondary path estimation is usually performed offline, prior to online modeling, where in the prop...


A Robust Feedforward Active Noise Control System with a Variable Step-Size FxLMS Algorithm: Designing a New Online Secondary Path Modelling Method

Several approaches have been introduced in the literature for active noise control (ANC) systems. Since the Filtered-x Least Mean Square (FxLMS) algorithm appears to be the best choice as a controller filter, researchers tend to improve the performance of ANC systems by enhancing and modifying this algorithm. This paper proposes a new version of the FxLMS algorithm. In many ANC applications an online secondary pat...


Correlation between Online Learner Readiness with Psychological Distress related to e-Learning among Nursing and Midwifery Students during COVID-19 pandemic

Introduction: With the sudden shift from face-to-face education to e-learning during the COVID-19 pandemic, awareness of learners' readiness for online learning and its impact on students' psychological distress related to e-learning is important for teachers, counselors, and educational planners. Therefore, the present study was conducted to investigate the correlation between on...



Journal:
  • Journal of Machine Learning Research

Volume 10


Publication date: 2009